The Statistics of Sequence Similarity Scores


Abstract

To assess whether a given alignment constitutes evidence for homology, it helps to know how strong an alignment can be expected from chance alone. In this context, "chance" can mean the comparison of (i) real but non-homologous sequences; (ii) real sequences that are shuffled to preserve compositional properties [1-3]; or (iii) sequences that are generated randomly based upon a DNA or protein sequence model. Analytic statistical results invariably use the last of these definitions of chance, while empirical results based on simulation and curve fitting may use any of the definitions.
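As an illustration of definition (ii), the following minimal sketch builds an empirical chance distribution by shuffling one sequence, which preserves its residue composition, and rescoring it against the other. It is not taken from the article: the function names (`local_score`, `shuffled_null`, `empirical_p`), the match/mismatch/gap values, and the add-one p-value estimate are all assumptions chosen for brevity.

```python
import random

def local_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score with a linear gap penalty.

    The scoring values are illustrative assumptions, not recommended settings.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            # A local alignment may restart anywhere, so scores are floored at 0.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best

def shuffled_null(a, b, n=200, seed=0):
    """Scores of `a` against `n` shuffled copies of `b` (composition preserved)."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    letters = list(b)
    scores = []
    for _ in range(n):
        rng.shuffle(letters)
        scores.append(local_score(a, "".join(letters)))
    return scores

def empirical_p(observed, null_scores):
    """Fraction of chance scores >= observed, with an add-one correction."""
    hits = sum(s >= observed for s in null_scores)
    return (hits + 1) / (len(null_scores) + 1)
```

A typical use would be `empirical_p(local_score(a, b), shuffled_null(a, b))`. With only a few hundred shuffles the smallest p-value this can report is about 1/(n+1), which is one reason empirical approaches often fit a curve to the simulated scores instead of counting them directly.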


Related resources

A unified statistical framework for sequence comparison and structure comparison.

We present an approach for assessing the significance of sequence and structure comparisons by using nearly identical statistical formalisms for both sequence and structure. Doing so involves an all-vs.-all comparison of protein domains [taken here from the Structural Classification of Proteins (scop) database] and then fitting a simple distribution function to the observed scores. By using thi...


A generalization of Profile Hidden Markov Model (PHMM) using one-by-one dependency between sequences

The Profile Hidden Markov Model (PHMM) can be poor at capturing dependency between observations because of the statistical assumptions it makes. To overcome this limitation, the dependency between residues in a multiple sequence alignment (MSA) which is the representative of a PHMM can be combined with the PHMM. Based on the fact that sequences appearing in the final MSA are written based on th...


Empirical statistical estimates for sequence similarity searches.

The FASTA package of sequence comparison programs has been modified to provide accurate statistical estimates for local sequence similarity scores with gaps. These estimates are derived using the extreme value distribution from the mean and variance of the local similarity scores of unrelated sequences after the scores have been corrected for the expected effect of library sequence length. This...


Statistics of local multiple alignments

Summary: BLAST statistics have been shown to be extremely useful for searching for significant similarity hits, for amino acid and nucleotide sequences. Although these statistics are well understood for pairwise comparisons, there has been little success developing statistical scores for multiple alignments. In particular, there is no score for multiple alignment that is well founded and treated...


Sequence analysis of ORF94 in different White Spot Syndrome Virus (WSSV) isolates of Iran

White spot syndrome virus (WSSV) is a pathogen that causes high mortality in shrimp culture worldwide. Sequence analysis of WSSV has shown similarity among WSSV isolates from different countries, with the exception of a few variable genomic loci. This study investigated the sequence variation of some Iranian WSSV isolates and previously identified isolates. Samples were collected during target ...


Molecular characterization of apolipoprotein A-I from the skin mucosa of Cyprinus carpio

Apolipoprotein A-I is the most abundant protein in Cyprinus carpio plasma and plays an important role in lipid transport and in protection of the skin through its antimicrobial activity. A 527 bp cDNA fragment encoding the C-terminal part of apoA-I from the skin mucosa of common carp was isolated using RT-PCR. After GenBank database searching, a partial sequence containing a coding sequence (CDS)...



Publication date: 2002
